Scheduling Vector Straight Line Code on Vector Processors

Authors

  • Christoph W. Kessler
  • Wolfgang J. Paul
  • Thomas Rauber
Abstract

We present an algorithm to schedule basic blocks of vector three-address instructions. This algorithm is suited for a special class of vector processors containing a buffer (register file) which may be partitioned arbitrarily into vector registers by the user. The algorithm computes the best ratio of vector register spilling to strip mining, taking the vector length and the buffer size into consideration, as well as several machine parameters of the target architecture. We apply the algorithm to groups of vector instructions within a basic block that are quasiscalar, i.e. all vectors occurring in the group must have one fixed length L.
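
To make the terms in the abstract concrete, the following is a minimal sketch of the trade-off between keeping more vectors in the buffer and strip mining. It is not the authors' algorithm: the function name `strip_plan`, the even partitioning of the buffer, and the parameter names are all assumptions made for illustration, and the machine parameters mentioned in the abstract are ignored.

```python
from math import ceil

def strip_plan(vector_length, buffer_words, live_vectors):
    """Hypothetical helper (not from the paper).

    Splits a buffer of `buffer_words` words evenly into `live_vectors`
    vector registers and reports how many strips are needed to process
    vectors of the fixed length `vector_length`.
    Returns (register_length, number_of_strips).
    """
    if vector_length <= 0:
        raise ValueError("vector_length must be positive")
    if live_vectors <= 0 or buffer_words < live_vectors:
        raise ValueError("buffer too small for the requested number of registers")
    # Each register gets an equal share of the buffer, capped at the vector length.
    register_length = min(vector_length, buffer_words // live_vectors)
    # Strip mining: the whole group is executed once per strip.
    strips = ceil(vector_length / register_length)
    return register_length, strips

# Keeping 8 vectors resident forces 2 strips; keeping only 4 (and spilling
# the rest) lets each register hold a full vector, so 1 strip suffices.
print(strip_plan(1000, 4096, 8))
print(strip_plan(1000, 4096, 4))
```

Under these assumptions, holding fewer vectors in the buffer means more spilling but longer registers and fewer strips; the algorithm described in the abstract chooses the balance between the two.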

Similar resources

SIMD Vectorization of Straight Line FFT Code

This paper presents compiler technology that targets general purpose microprocessors augmented with SIMD execution units for exploiting data level parallelism. FFT kernels are accelerated by automatically vectorizing blocks of straight line code for processors featuring two-way short vector SIMD extensions like AMD’s 3DNow! and Intel’s SSE 2. Additionally, a special compiler backend is introduc...

FFT Compiler Techniques

This paper presents compiler technology that targets general purpose microprocessors augmented with SIMD execution units for exploiting data level parallelism. Numerical applications are accelerated by automatically vectorizing blocks of straight line code to be run on processors featuring two-way short vector SIMD extensions like Intel’s SSE 2 on Pentium 4, SSE 3 on Intel Prescott, AMD’s 3DNow...

OPTIMAL PREEMPTIVE ONLINE SCHEDULING TO MINIMIZE lp NORM ON TWO PROCESSORS

We consider an on-line scheduling problem, where jobs arrive one by one to be scheduled on two identical parallel processors with preemption. The objective is to minimize the machine completion time vector with respect to the lp norm. We present a best possible deterministic on-line scheduling algorithm along with a matching lower bound.

Computing the Hough Transform on a Scan Line Array Processor (Image Processing)

This paper describes a parallel algorithm for a line-finding Hough transform that runs on a linearly connected, SIMD vector of processors. We show that a high-precision transform, usually considered to be an expensive global operation, can be performed efficiently, in two to three times real time, with only local communication on a long vector. The algorithm also illustrates a decomposition pri...

Line segment distribution of sketches for Persian signature recognition

A novel fast method for line segment extraction based on chain code representation of thinned sketches (or edge maps) is presented and exploited for Persian signature recognition. The method has a parallel nature and can be employed on parallel machines. It breaks the macro chains into several micro chains after applying shifting, smoothing and differentiating. The micro chains are then approxi...

Publication date: 1991